Reordering methods for data locality improvement
Abstract
Cache memories were invented to decouple fast processors from slow memories. However, this decoupling is only partial, and many researchers have attempted to improve cache use through program optimization. The potential benefits are significant, since both energy dissipation and performance depend heavily on the traffic between memory levels. But modeling this traffic is difficult, and this observation has led to the use of heuristic methods for steering program transformations. In this paper, we propose another approach: we simplify the cache model and organize the target program in such a way that an asymptotic evaluation of the memory traffic is possible. Our optimization algorithm uses this information to find the best reordering of the program operations, at least in an asymptotic sense. Our method optimizes both temporal and spatial locality. It can be applied to any static control program with arbitrary dependences. The optimizer has been partially implemented and applied to non-trivial programs. We present experimental evidence that the number of cache misses is drastically reduced, with corresponding performance improvements.
Similar resources
Using Hypergraphs to Improve Iteration Reordering Heuristics
Irregular applications exhibit poor performance on current computer architectures because of their inefficient use of the memory hierarchy. Figure 1 shows iteration over an edge list as an example of the types of memory references that occur in irregular applications. Run-time data and iteration reordering transformations have been shown to improve the locality of such loops and therefore the p...
On Improving the Memory Access Patterns During The Execution of Strassen's Matrix Multiplication Algorithm
Matrix multiplication is a basic computing operation. Although basic, it is also very expensive: the straightforward technique has O(N^3) runtime complexity. More complex solutions such as Strassen’s algorithm exist that reduce this complexity to O(N^(log2 7)); the recursive nature of such algorithms places a large burden on memory systems due to temporary storage and the lack of locality in ...
A Data Reorganization Technique for Improving Data Locality of Irregular Applications in Software Distributed Shared Memory
Irregular applications are characterized by highly irregular and fine-grained data referencing patterns. When there is poor locality between the fine-grained data, serious false sharing can occur, which has largely contributed to poor performance of irregular applications on page-based software distributed shared memory (DSM) systems. Partitioning data in irregular applications to improve data local...
Adjacency-based data reordering algorithm for acceleration of finite element computations
Effective use of the processor memory hierarchy is an important issue in high-performance computing. In this work, a part-level mesh topological traversal algorithm is used to define a reordering of both mesh vertices and regions that increases the spatial locality of data and improves overall cache utilization during on-processor finite element calculations. Examples based on adaptively create...
Performance optimization of irregular codes based on the combination of reordering and blocking techniques
This paper studies the combination of data reordering techniques with classic code restructuring techniques for increasing locality in the execution of sparse algebra codes. The reordering techniques first model the locality at run time and then apply a heuristic to increase it. After this, a code restructuring technique specially tuned for sparse alge...